Simulation-based inference (SBI) solves statistical inverse problems by repeatedly running a stochastic simulator and inferring posterior distributions from model-simulations. To improve simulation efficiency, several inference methods take a sequential approach and iteratively adapt the proposal distributions from which model simulations are generated. However, many of these sequential methods are difficult to use in practice, both because the resulting optimisation problems can be challenging and efficient diagnostic tools are lacking. To overcome these issues, we present Truncated Sequential Neural Posterior Estimation (TSNPE). TSNPE performs sequential inference with truncated proposals, sidestepping the optimisation issues of alternative approaches. In addition, TSNPE allows to efficiently perform coverage tests that can scale to complex models with many parameters. We demonstrate that TSNPE performs on par with previous methods on established benchmark tasks. We then apply TSNPE to two challenging problems from neuroscience and show that TSNPE can successfully obtain the posterior distributions, whereas previous methods fail. Overall, our results demonstrate that TSNPE is an efficient, accurate, and robust inference method that can scale to challenging scientific models.
translated by 谷歌翻译
有条件神经密度估计器的仿真推断是解决科学逆问题的强大方法。然而,这些方法通常将底层向前模型视为一个黑匣子,没有办法利用等物学,例如协调。协调在科学模型中是常见的,然而将它们直接集成到表达推导网络中(例如标准化流动)并不简单。我们在这里描述了在参数和数据的联合转换下掺入协调的替代方法。我们的方法 - 称为组等级神经后后估计(GNPE) - 基于自始终标准化数据的“姿势”,同时估计在参数上后部。它是独立的架构,并适用于精确和近似的协调。作为现实世界的应用,我们使用GNPE从引力波观测到Astrophysical Block Block Systems的摊销推理。我们表明GNPE实现了最先进的准确性,同时减少了三个数量级的推理时间。
translated by 谷歌翻译
部署到现实世界的自主智能代理必须与对感官输入的对抗性攻击保持强大的态度。在加强学习中的现有工作集中于最小值扰动攻击,这些攻击最初是为了模仿计算机视觉中感知不变性的概念。在本文中,我们注意到,这种最小值扰动攻击可以由受害者琐碎地检测到,因为这些导致观察序列与受害者的行为不符。此外,许多现实世界中的代理商(例如物理机器人)通常在人类主管下运行,这些代理商不容易受到这种扰动攻击的影响。结果,我们建议专注于幻觉攻击,这是一种与受害者的世界模式一致的新型攻击形式。我们为这个新颖的攻击框架提供了正式的定义,在各种条件下探索了其特征,并得出结论,代理必须寻求现实主义反馈以对幻觉攻击具有强大的态度。
translated by 谷歌翻译
原始的“七个图案”阐述了科学计算领域的基本方法的路线图,其中图案是一种捕获计算和数据移动模式的算法方法。我们介绍了“仿真智力的九个主题”,是一种开发和整合的路线图,以合并科学计算,科学模拟和人工智能所必需的基本算法。我们称之为合并模拟智能(SI),短暂。我们认为模拟智能的主题是相互连接的和相互依存的,很像操作系统层中的组件一样。使用这种隐喻,我们探讨了模拟智能操作系统堆栈(Si-Stack)和其中图案的各层的性质:(1)多种物理和多尺度建模; (2)替代建模和仿真; (3)基于仿真的推理; (4)因果建模和推理; (5)基于代理的建模; (6)概率编程; (7)可微分的编程; (8)开放式优化; (9)机器编程。我们相信图案之间的协调努力提供了加速科学发现的巨大机会,从综合生物和气候科学中解决逆问题,指导核能实验,并预测社会经济环境中的紧急行为。我们详细说明了Si-stack的每层,详细说明了最先进的方法,提出了示例以突出挑战和机遇,并倡导具体的方法来推进主题和与其组合的协同作用。推进和整合这些技术可以实现稳健且有效的假设仿真 - 分析类型的科学方法,我们用几种使用案例为人机组合和自动化学介绍。
translated by 谷歌翻译
Making histopathology image classifiers robust to a wide range of real-world variability is a challenging task. Here, we describe a candidate deep learning solution for the Mitosis Domain Generalization Challenge 2022 (MIDOG) to address the problem of generalization for mitosis detection in images of hematoxylin-eosin-stained histology slides under high variability (scanner, tissue type and species variability). Our approach consists in training a rotation-invariant deep learning model using aggressive data augmentation with a training set enriched with hard negative examples and automatically selected negative examples from the unlabeled part of the challenge dataset. To optimize the performance of our models, we investigated a hard negative mining regime search procedure that lead us to train our best model using a subset of image patches representing 19.6% of our training partition of the challenge dataset. Our candidate model ensemble achieved a F1-score of .697 on the final test set after automated evaluation on the challenge platform, achieving the third best overall score in the MIDOG 2022 Challenge.
translated by 谷歌翻译
Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.
translated by 谷歌翻译
The release of ChatGPT, a language model capable of generating text that appears human-like and authentic, has gained significant attention beyond the research community. We expect that the convincing performance of ChatGPT incentivizes users to apply it to a variety of downstream tasks, including prompting the model to simplify their own medical reports. To investigate this phenomenon, we conducted an exploratory case study. In a questionnaire, we asked 15 radiologists to assess the quality of radiology reports simplified by ChatGPT. Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed key medical findings, and potentially harmful passages were reported. While further studies are needed, the initial insights of this study indicate a great potential in using large language models like ChatGPT to improve patient-centered care in radiology and other medical domains.
translated by 谷歌翻译
Real-life tools for decision-making in many critical domains are based on ranking results. With the increasing awareness of algorithmic fairness, recent works have presented measures for fairness in ranking. Many of those definitions consider the representation of different ``protected groups'', in the top-$k$ ranked items, for any reasonable $k$. Given the protected groups, confirming algorithmic fairness is a simple task. However, the groups' definitions may be unknown in advance. In this paper, we study the problem of detecting groups with biased representation in the top-$k$ ranked items, eliminating the need to pre-define protected groups. The number of such groups possible can be exponential, making the problem hard. We propose efficient search algorithms for two different fairness measures: global representation bounds, and proportional representation. Then we propose a method to explain the bias in the representations of groups utilizing the notion of Shapley values. We conclude with an experimental study, showing the scalability of our approach and demonstrating the usefulness of the proposed algorithms.
translated by 谷歌翻译
Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support an appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System to classify DR Grading, localize lesion areas, and provide visual explanations; (ii) DRG-Expert-Interaction to receive feedback from user-expert and improve the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. Besides, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and loss functions constraint between lesion features and classification features, our approach can be robust given a certain level of noise in the feedback of users. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly-supervised manner.
translated by 谷歌翻译
Participants in political discourse employ rhetorical strategies -- such as hedging, attributions, or denials -- to display varying degrees of belief commitments to claims proposed by themselves or others. Traditionally, political scientists have studied these epistemic phenomena through labor-intensive manual content analysis. We propose to help automate such work through epistemic stance prediction, drawn from research in computational semantics, to distinguish at the clausal level what is asserted, denied, or only ambivalently suggested by the author or other mentioned entities (belief holders). We first develop a simple RoBERTa-based model for multi-source stance predictions that outperforms more complex state-of-the-art modeling. Then we demonstrate its novel application to political science by conducting a large-scale analysis of the Mass Market Manifestos corpus of U.S. political opinion books, where we characterize trends in cited belief holders -- respected allies and opposed bogeymen -- across U.S. political ideologies.
translated by 谷歌翻译